Pattern recognition methods for advanced stochastic protein sequence analysis using HMMs

Authors

  • Thomas Plötz
  • Gernot A. Fink
Abstract

Currently, Profile Hidden Markov Models (Profile HMMs) are the methodology of choice for probabilistic protein family modeling. Despite substantial progress, however, the general problem of remote homology analysis is still far from solved. In this article we propose new approaches for robust protein family modeling that consistently exploit general pattern recognition techniques. A new feature-based representation of amino acid sequences serves as the basis for semi-continuous protein family HMMs. This paradigm shift in processing biological sequences allows the complexity of family models to be reduced substantially, resulting in fewer parameters that need to be trained. This is especially favorable when only little training data is available, as in most current tasks of molecular biology research. In various experiments we demonstrate the superior performance of advanced stochastic protein family modeling for remote homology analysis, which is especially relevant for applications such as drug discovery.
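
As a rough illustration of the term "semi-continuous" used in the abstract, the sketch below evaluates a state's emission density as a state-specific weighting of a single Gaussian codebook shared by all states, so each additional state contributes only a weight vector. This shows the general technique only, not the authors' model: the diagonal covariances, the toy codebook, the weights, and all function names are assumptions made for the example.

```python
import math

def gaussian_pdf(x, mean, var):
    """Density of a diagonal-covariance Gaussian at feature vector x."""
    log_p = 0.0
    for xi, mi, vi in zip(x, mean, var):
        log_p += -0.5 * (math.log(2.0 * math.pi * vi) + (xi - mi) ** 2 / vi)
    return math.exp(log_p)

def semi_continuous_emission(x, codebook, weights):
    """b_j(x) = sum_k c_jk * N(x | mu_k, var_k).

    The codebook of (mean, variance) pairs is shared by every state;
    only the mixture weights c_jk belong to state j.
    """
    return sum(c * gaussian_pdf(x, mean, var)
               for c, (mean, var) in zip(weights, codebook))

# Hypothetical toy setup: 2-D feature vectors, two shared densities, two states.
codebook = [([0.0, 0.0], [1.0, 1.0]),
            ([3.0, 3.0], [1.0, 1.0])]
state_weights = [[0.9, 0.1],   # state 0 leans on the first density
                 [0.2, 0.8]]   # state 1 leans on the second density

x = [0.0, 0.0]
emissions = [semi_continuous_emission(x, codebook, w) for w in state_weights]
print(emissions)
```

A feature vector near the first codebook density scores higher under state 0 than under state 1, even though both states evaluate the same two Gaussians.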

Similar resources

A Review of Hidden Markov Models in Face Recognition

Hidden Markov Models (HMMs) are a set of statistical models used to characterize the statistical properties of a signal. An HMM is a doubly stochastic process with an underlying stochastic process that is not observable, but can be observed through another set of stochastic processes that produce a sequence of observed symbols. An HMM has a finite set of states, each of which is associated with...

Binary pattern recognition using Markov random fields and HMMs

In this paper we present a stochastic framework for the recognition of binary random patterns which advantageously combines HMMs and Markov random fields (MRFs). The HMM component of the model analyzes the image along one direction, in a specific state observation probability given by the product of causal MRF-like pixel conditional probabilities. Aspects concerning definition, training and recognit...

Improving Phoneme Sequence Recognition using Phoneme Duration Information in DNN-HSMM

Improving phoneme recognition has attracted the attention of many researchers due to its applications in various fields of speech processing. Recent research shows that using deep neural networks (DNNs) in speech recognition systems significantly improves their performance. DNN-based phoneme recognition systems have two phases: training and testing. Mos...

Finding Genes by Hidden Markov Models with a Protein Motif Dictionary

A new method for combining a protein motif dictionary with a gene finding system is proposed. The system consists of Hidden Markov Models (HMMs) and a dictionary. The HMMs represent the nucleotide acid bases, the codons, and the amino acids. The 'words' in the dictionary are described by sequences of these HMMs and represent the noncoding regions, the codons, protein motifs, tRNA regions and signals...

Modeling of Pen-Coordinate Information in SCPR-based HMM for On-line Recognition of Handwritten Japanese Characters

This paper describes stochastic modeling of pen-coordinate information in HMMs with structured character pattern representation (SCPR) for on-line recognition of handwritten Japanese characters. SCPR allows HMMs for Kanji character patterns to share common subpatterns. Although SCPR-based HMMs have been successfully applied to Kanji character recognition, the pen-coordinate feature has not been ...

Journal title:
  • Pattern Recognition

Volume 39  Issue 

Pages  -

Publication date 2006